Amendments to the Claims:

This listing of claims will replace all prior versions, and listings, of claims in 
the application: 

Listing of Claims:

1. (Currently amended) A method, comprising:
identifying a loop in a program; 
identifying each vector memory reference in the loop; 
determining dependencies between vector memory references in the loop,
including determining unidirectional and circular dependencies; and
reducing cache thrashing by distributing the vector memory references into
a plurality of detail loops configured to allocate the vector memory 
references into a plurality of temporary arrays, sized and located, so 
that none of the vector memory references are cache synonyms, 
wherein the vector memory references that have circular 
dependencies therebetween are included in a common detail loop, 
and wherein the detail loops are ordered according to the 
unidirectional dependencies between the memory references; 
analyzing an execution profile of the program after said distributing; and
based on the execution profile, determining whether to repeat said
identifying a loop, said identifying each vector memory reference,
said determining dependencies, and said distributing.
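
For illustration only, the grouping and ordering recited in claim 1 can be sketched as follows. This is a minimal sketch, not the applicant's implementation: it assumes dependencies are given as (producer, consumer) pairs, treats each set of mutually (circularly) dependent references as one detail loop, and orders the detail loops so that producers come first. The function name `group_into_detail_loops` and the reference names are hypothetical.

```python
from collections import defaultdict


def group_into_detail_loops(refs, deps):
    """Group memory references into detail loops (illustrative sketch).

    refs: list of reference names (hypothetical labels).
    deps: list of (src, dst) pairs meaning dst depends on src.
    Returns a list of groups; circularly dependent references share a
    group, and groups are ordered so producers precede consumers.
    """
    graph = defaultdict(list)
    for src, dst in deps:
        graph[src].append(dst)

    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(node):
        # Tarjan's strongly-connected-components search.
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph[node]:
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for ref in refs:
        if ref not in index:
            visit(ref)

    # Tarjan emits components consumers-first; reverse so that the
    # unidirectional dependencies run from earlier to later groups.
    components.reverse()
    return components


loops = group_into_detail_loops(
    ["A", "B", "C", "D"],
    [("A", "B"), ("B", "C"), ("C", "B"), ("C", "D")],
)
```

In this example B and C depend on each other and therefore land in one group, while A precedes them and D follows them.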

2. (Original) A method, as set forth in claim 1, further comprising allocating a
plurality of temporary storage areas within a cache and determining the size of 
each temporary storage area based on the size of the cache and the number of 
temporary storage areas. 
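
For illustration only, one way to size the temporary storage areas of claim 2 is to divide the cache evenly among them. This is a sketch under stated assumptions: an even split, areas laid out back to back starting at offset zero, and a known element size. The function names and the 8 KB cache in the example are hypothetical and do not come from the application.

```python
def temp_area_bytes(cache_bytes, num_temp_areas):
    """Bytes available to each temporary storage area (even split assumed)."""
    if cache_bytes <= 0 or num_temp_areas <= 0:
        raise ValueError("cache size and area count must be positive")
    return cache_bytes // num_temp_areas


def strip_length(cache_bytes, num_temp_areas, element_bytes):
    """Number of elements of one vector that fit in one temporary area."""
    if element_bytes <= 0:
        raise ValueError("element size must be positive")
    return temp_area_bytes(cache_bytes, num_temp_areas) // element_bytes


def temp_area_offsets(cache_bytes, num_temp_areas):
    """Start offsets of consecutively placed temporary areas."""
    size = temp_area_bytes(cache_bytes, num_temp_areas)
    return [i * size for i in range(num_temp_areas)]


# Hypothetical example: an 8 KB cache shared by four areas of 8-byte elements.
elements_per_strip = strip_length(8192, 4, 8)
offsets = temp_area_offsets(8192, 4)
```

Because the areas are placed consecutively and together span no more than the cache size, no two of them overlap within the cache under these assumptions.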

3. (Original) A method, as set forth in claim 1, further comprising at least one
section loop including the plurality of detail loops.

4. (Original) A method, as set forth in claim 1, wherein distributing the vector
memory references into a plurality of detail loops further comprises distributing 
the vector memory references into a plurality of detail loops that each contain at 
least one vector memory reference that could benefit from cache management. 

5. (Original) A method, as set forth in claim 1, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

6. (Original) A method, as set forth in claim 1, further comprising inserting 
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory. 

7. (Original) A method, as set forth in claim 1, further comprising performing
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

8. (Original) A method, as set forth in claim 1, further comprising inserting at 
least one of a prefetch instruction and a cache management instruction into at 
least one of said detail loops to control movement of data associated with the 
vector memory reference between a cache and main memory, and performing
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

9. (Currently amended) A method, comprising: 
identifying a loop in a program; 

identifying each vector memory reference in the loop;
determining dependencies between vector memory references in the
loop; and 

reducing cache thrashing by distributing the vector memory references into
a plurality of detail loops that serially proceed through strips of the
vector memory references and store the strips in temporary arrays
so that none of the vector memory references are cache synonyms, 
wherein the vector memory references that have dependencies 
therebetween are included in a common detail loop; 

wherein said distributing the vector memory references into a plurality of
detail loops is performed by a first computer for execution by a
second computer.

10. (Original) A method, as set forth in claim 9, further comprising allocating a 
plurality of temporary storage areas within a cache and determining the size of 
each temporary storage area based on the size of the cache and the number of 
temporary storage areas. 

11. (Original) A method, as set forth in claim 9, further comprising at least one
section loop including the plurality of detail loops. 

12. (Original) A method, as set forth in claim 9, wherein distributing the vector 
memory references into a plurality of detail loops further comprises distributing
the vector memory references into a plurality of detail loops that each contain at 
least one vector memory reference that could benefit from cache management.

13. (Original) A method, as set forth in claim 9, further comprising inserting
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory.

14. (Original) A method, as set forth in claim 9, further comprising inserting
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory. 

15. (Original) A method, as set forth in claim 9, further comprising performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

16. (Original) A method, as set forth in claim 9, further comprising inserting at 
least one of a prefetch instruction and a cache management instruction into at
least one of said detail loops to control movement of data associated with the
vector memory reference between a cache and main memory, and performing
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

17. (Currently amended) A method, comprising: 
identifying a loop in a program;
identifying each vector memory reference in the loop;
determining dependencies between vector memory references in the
loop; and

reducing cache thrashing by distributing the vector memory references into
a plurality of detail loops in response to cache behavior and the 
dependencies between the vector memory references in the loop, 
wherein the detail loops cause storage of the vector memory 
references in temporary arrays that are allocated consecutively so
that no temporary array elements are cache synonyms,
wherein said identifying a loop, said identifying each vector memory
reference, said determining dependencies between vector memory
references and said distributing the vector memory references into
a plurality of detail loops produce code that is substantially
independent of a computer architecture; and
performing code optimizations that are dependent on a computer
architecture after said distributing.

18. (Original) A method, as set forth in claim 17, wherein distributing the
vector memory references further comprises distributing the vector memory 
references into the plurality of detail loops with each loop having at least one of 
the identified vector memory references. 

19. (Original) A method, as set forth in claim 17, further comprising 
determining dependencies between vector memory references in the loop, and 
wherein distributing the loop includes distributing the vector memory references 
into the plurality of detail loops, wherein the vector memory references that have 
circular dependencies therebetween are included in a common detail loop.

20. (Original) A method, as set forth in claim 17, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

21. (Original) A method, as set forth in claim 17, further comprising inserting 
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory. 

22. (Original) A method, as set forth in claim 17, further comprising performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory.

23. (Original) A method, as set forth in claim 17, further comprising inserting
at least one of a prefetch instruction and a cache management instruction into at 
least one of said detail loops to control movement of data associated with the 
vector memory reference between a cache and main memory, and performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

24. (Currently amended) A computer programmed to perform a method,
comprising:

identifying a loop in a program; 
identifying each vector memory reference in the loop; 
determining dependencies between vector memory references in the 
loop; and 

reducing cache thrashing by distributing the vector memory references into
a plurality of detail loops configured to retrieve strips of the vector
memory references and store the strips in temporary arrays,
wherein the vector memory references that have circular
dependencies therebetween are included in a common detail loop,
wherein the temporary arrays are configured to simultaneously fit in a
single cache bank.

25. (Currently amended) A program storage medium encoded with 
instructions that, when executed by a computer, perform a method, comprising: 
identifying a loop in a program; 
identifying each vector memory reference in the loop; 
determining dependencies between vector memory references in the 
loop; and 

reducing cache thrashing by generating expanded code of the program
by distributing the vector memory references into a plurality of detail
loops configured to allocate the vector memory references into
temporary arrays that avoid cache synonyms, wherein the vector
memory references that have circular dependencies therebetween
are included in a common detail loop,
wherein the expanded code is substantially independent of computer
architectures.

26. (Currently amended) A compiler, comprising: 

means for identifying a loop in a program; 

means for identifying each vector memory reference in the loop;

means for determining dependencies between vector memory references 
in the loop, including determining unidirectional and circular
dependencies; and 

means for reducing cache thrashing by distributing the vector memory
references into a plurality of detail loops configured to serially 
process strips of the vector memory references so that thrashing 
does not occur, wherein the vector memory references that have 
circular dependencies therebetween are included in a common
detail loop, and wherein the detail loops are ordered according to 
the unidirectional dependencies between the memory references; 

means for determining an execution profile of the program after said
distributing occurs; and

means for selectively repeating use of said means for identifying a loop,
said means for identifying each vector memory reference, said
means for determining dependencies, said means for distributing
the vector memory references into a plurality of detail loops, and
said means for determining an execution profile based on said
execution profile.

27. (Original) A compiler, as set forth in claim 26, further comprising means
for allocating a plurality of temporary storage areas within a cache and
determining the size of each temporary storage area based on the size of the
cache and the number of temporary storage areas. 

28. (Original) A compiler, as set forth in claim 26, wherein the means for
distributing the vector memory references into a plurality of detail loops further 
comprises distributing the vector memory references into a plurality of detail loops 
that each contain at least one vector memory reference that could benefit from 
cache management. 

29. (Original) A compiler, as set forth in claim 26, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

30. (Original) A compiler, as set forth in claim 26, further comprising means
for inserting prefetch instructions into at least one of said detail loops to control
movement of data associated with the vector memory reference between a cache 
and main memory. 

31. (Original) A compiler, as set forth in claim 26, further comprising means 
for performing loop unrolling on at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

32. (Original) A compiler, as set forth in claim 26, further comprising means 
for inserting at least one of a prefetch instruction and a cache management
instruction into at least one of said detail loops to control movement of data
associated with the vector memory reference between a cache and main
memory, and performing loop unrolling on at least one of said detail loops to
control movement of data associated with the vector memory reference between 
a cache and main memory.

33. (Currently amended) A method for reducing the likelihood of cache
thrashing by software to be executed on a computer system having a cache, 
comprising: 

executing the software on the computer system; 

generating a profile indicating the manner in which the software uses the 
cache; 

identifying a portion of the software that exhibits cache thrashing based on 
the profile data; and 

modifying the identified portion of the software to reduce the likelihood of 
cache thrashing by distributing cache synonyms into detail loops 
configured to allocate the cache synonyms into temporary storage 
areas, sized and located, to prevent cache thrashing,

wherein said modifying occurs before optimizations that are based on an
architecture of the computer system.

34. (Previously presented) A method, as set forth in claim 33, wherein
modifying the identified portion of the software to reduce the likelihood of cache 
thrashing further comprises: 

identifying a loop in the identified portion of the software; 

identifying each vector memory reference in the identified loop;

determining dependencies between the vector memory references in the
identified loop of the software, including determining unidirectional
and circular dependencies; and 

reducing cache thrashing by distributing the vector memory references into 
a plurality of detail loops, wherein the vector memory references 
that have circular dependencies therebetween are included in a 
common detail loop, and wherein the detail loops are ordered 
according to the unidirectional dependencies between the memory 
references.

35. (Currently amended) A method for reducing the likelihood of cache
thrashing by software to be executed on a computer system having a cache, 
comprising: 

executing the software on the computer system; 

generating a profile indicating the manner in which the memory references
of the software use the cache;

identifying a portion of the memory references based on the profile,
wherein the portion of the memory references is determined to
cause cache thrashing; and

reducing cache thrashing by distributing the portion of the memory
references into distinct loops that allocate strips of the memory
references into temporary arrays for execution,

wherein the temporary arrays are configured to simultaneously fit in a
single cache bank.

36. (Previously presented) The computer of claim 24 wherein the temporary 
arrays are allocated consecutively such that no temporary array elements are 
cache synonyms. 

37. (Previously presented) The computer of claim 24 wherein the detail loops
are allocated into section loops that cause iterative execution of the detail loops 
based on a size of the strips. 
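
For illustration only, the relationship between a section loop and its detail loops can be sketched as below. This is a minimal sketch and not the claimed implementation: it assumes a simple element-wise sum of three vectors, uses ordinary Python lists as stand-ins for the temporary arrays, and does not model an actual cache. The function name `distributed_sum` and the strip size are hypothetical.

```python
def distributed_sum(a, b, c, strip):
    """Compute d[i] = a[i] + b[i] + c[i] using strip-mined, distributed loops."""
    if strip <= 0:
        raise ValueError("strip size must be positive")
    n = len(a)
    d = [0] * n
    # Section loop: advances through the vectors one strip at a time.
    for start in range(0, n, strip):
        end = min(start + strip, n)
        # Detail loops: each copies one strip of one vector into its own
        # temporary array, so only one source vector is streamed at a time.
        tmp_a = [a[i] for i in range(start, end)]
        tmp_b = [b[i] for i in range(start, end)]
        tmp_c = [c[i] for i in range(start, end)]
        # Final detail loop: combines the strips held in the temporaries.
        for j in range(end - start):
            d[start + j] = tmp_a[j] + tmp_b[j] + tmp_c[j]
    return d


result = distributed_sum([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500], 2)
```

The section loop repeats the detail loops once per strip, so the number of iterations follows from the vector length and the strip size; the last strip may be shorter than the others.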

38. (Previously presented) The program storage medium of claim 25 wherein
the temporary arrays are allocated consecutively such that no temporary array 
elements are cache synonyms. 

39. (Previously presented) The method of claim 35 wherein the temporary 
arrays are located and sized to reduce cache thrashing. 



Page 11 of 17 HP PDNO 200301732-1


Appl. No. 09/785,143 

Amdt. dated April 26, 2005

Reply to Office action of January 26, 2005 



40. (Previously presented) The method of claim 35 wherein the temporary 
arrays are allocated consecutively and iteratively executed by a size of the strips. 